*****************************************************************
*****************************************************************
** READ ME FILE FOR
SPENDING RESPONSE TO A PREDICTABLE INCREASE IN MORTGAGE REPAYMENTS:
EVIDENCE FROM EXPIRING INTEREST-ONLY LOANS			*
*****************************************************************
** AUTHORS:					*
**	HENRIK YDE ANDERSEN					*
**	STINE LUDVIG BECH					*
**	ALESSIA DE STEFANI					*
*****************************************************************
** October   2021 *
*****************************************************************
*****************************************************************
****DATA SOURCES AND DATA ACCESS************************

Data for this project are derived from the Danish administrative registries.
These data are proprietary and are maintained and administered by Statistics Denmark.
All researchers affiliated with Danish research institutions can request access to the registries from Statistics Denmark (forskningsservice@dst.dk).
Several individual registries must be combined to replicate this analysis; they are detailed below.



******** Codes required to obtain final dataset, in sequence*************

*A Develop property dataset
*0a_ChangeOfProperty (SAS)
This step creates an indicator for whether a person has engaged in a property trade during the year.
The resulting dataset is an input to step 0c.

Registries used as inputs:
1. EJER 
2. BEF

Output: famejendomshandler

*B Develop mortgage dataset

*0b_MortgageLoans (SAS)
This step takes the yearly REAL datasets and creates yearly datasets containing various new descriptive variables.
The code takes loan- and individual-level data and aggregates them to individual level.
The resulting dataset is an input to step 0c.

Registries used as inputs:
1. REAL

Output: rk_Individdata

*C Develop household-level dataset
*0c_Familydata (SAS)
The dataset is compiled and maintained by Danmarks Nationalbank on the basis of multiple registries from Statistics Denmark and the datasets from steps 0a and 0b.
The code takes individual-level data and aggregates them to household level. The household identifier is from Statistics Denmark.
The code also generates the conditioning variables used in the analysis.
Inputs: 1. Multiple individual and household-level registers from Statistics Denmark (details available upon request)
	2. famejendomshandler (from step 0a)
	3. rk_Individdata (from step 0b)

Output: familiedata
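
As an illustration only, the aggregation from individual to household level described in step 0c can be sketched in Python as follows. The actual step is implemented in SAS. Using familie_id as the grouping key, the field names person_id, income and debt, and the choice of summing are all assumptions made for this sketch; they are not the variables or rules used in 0c_Familydata.

```python
from collections import defaultdict


def aggregate_to_household(person_rows, value_fields):
    """Sum the given numeric fields over all persons sharing a household id."""
    totals = defaultdict(lambda: dict.fromkeys(value_fields, 0.0))
    for row in person_rows:
        household = totals[row["familie_id"]]  # assumed household key
        for field in value_fields:
            household[field] += row[field]
    return {fid: dict(vals) for fid, vals in totals.items()}


# Hypothetical individual-level records (not registry data).
persons = [
    {"person_id": 1, "familie_id": "A", "income": 300.0, "debt": 1000.0},
    {"person_id": 2, "familie_id": "A", "income": 250.0, "debt": 0.0},
    {"person_id": 3, "familie_id": "B", "income": 400.0, "debt": 500.0},
]
households = aggregate_to_household(persons, ["income", "debt"])
```

Here households holds one entry per familie_id, with each listed field summed over the household's members.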

*D Merge family, property and mortgage datasets; aggregate variables, select sample 

*1_Extractdata (SAS)
This step identifies the sample and creates the dataset used in our analysis.
The dataset aggregates loan- and individual-level data from REAL to household level.

Inputs:	1. REAL
	2. BEF
	3. FAMILIEDATA (from step 0c)

Macros:
1. Macro 'raadata':
	- reads the REAL datasets and creates variables to be used later
2. Macro 'overlev':
	- reads REAL data in years t and t-1 to identify the loans in year t that survived
3. Macro 'sample':
	- selects all loans that start with an IO period lasting at least nine years
4. Macro 'bef':
	- retrieves familie_id

Output: master.dta
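
The logic of the 'overlev' and 'sample' macros in step 1 can be sketched roughly as below. This is a Python illustration under stated assumptions, not a translation of the SAS code: the field names (loan_id, origination_year, io_start_year, io_end_year) are hypothetical, and "survived" is assumed here to mean that the same loan identifier appears in both year t and year t-1.

```python
def surviving_loans(loans_t, loans_prev):
    """Return the loans observed in year t that were also observed in year t-1."""
    prev_ids = {loan["loan_id"] for loan in loans_prev}
    return [loan for loan in loans_t if loan["loan_id"] in prev_ids]


def in_sample(loan, min_io_years=9):
    """True if the loan starts with an IO period of at least min_io_years."""
    starts_with_io = loan["io_start_year"] == loan["origination_year"]
    io_length = loan["io_end_year"] - loan["io_start_year"]
    return starts_with_io and io_length >= min_io_years


# Hypothetical loan records (not registry data).
loan_a = {"loan_id": 1, "origination_year": 2005,
          "io_start_year": 2005, "io_end_year": 2015}
loan_b = {"loan_id": 2, "origination_year": 2005,
          "io_start_year": 2005, "io_end_year": 2010}
loan_c = {"loan_id": 3, "origination_year": 2011,
          "io_start_year": 2011, "io_end_year": 2021}

loans_prev = [loan_a, loan_b]   # observed in year t-1
loans_t = [loan_a, loan_c]      # observed in year t

survivors = surviving_loans(loans_t, loans_prev)
sample = [loan for loan in survivors if in_sample(loan)]
```

In this toy example only loan 1 is present in both years, and its ten-year IO period satisfies the nine-year condition. Changing min_io_years would correspond to varying the condition on the length of the IO period.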

*E Define outcomes and controls for baseline sample

*4_VarDefinition (STATA 15)
Cleans master data and defines variables used in the paper.

Input: 	1. master.dta

Output: analysis.dta



**********************************************************************************************
*D_a Replicate step D but for a different sample (used in Table A8 and Appendix A2 only)

*2_Robust_IOeightyears_SAS	(SAS)
Creates a dataset of families that took up a loan in year t-10 and measures the length of the
latest IO period. The dataset is similar to the one in step 1 except for the condition on the length of the IO period.
The dataset aggregates loan- and individual-level data from REAL to household level.
Inputs:	1. REAL
	2. BEF
	3. FAMILIEDATA (from step 0c)

Macros:
1. Macro 'raadata':
	- reads the REAL datasets and creates variables to be used later
2. Macro 'overlev':
	- reads REAL data in years t and t-1 to identify the loans in year t that survived
3. Macro 'sample':
	- selects all loans that start with an IO period lasting at least nine years
4. Macro 'bef':
	- retrieves familie_id


Output: Robust_IOeightyears.dta

*E_a Define outcomes and controls for control sample
*3_Robust_IOeightyears_STATA (STATA 15)
Cleans the data on the early amortizers to create the dataset controlgroup.dta to be used in Table A8 and Appendix A2. 

Input:	1. Robust_IOeightyears.dta

Output: controlgroup.dta
*********************************************************

*F Regressions and charts used in the paper

*5_Analysis (STATA 15)

Replicates the results in the paper and the two appendices.

Input: 	1. analysis.dta
	2. controlgroup.dta
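
The run order implied by the inputs and outputs listed in this file can be checked with a small dependency sketch. The Python snippet below is only an illustration (it requires Python 3.9 or later for graphlib). The dependency map is transcribed from the steps described above; the scripts themselves are SAS and Stata programs and must be run in their own environments.

```python
from graphlib import TopologicalSorter

# Each script maps to the scripts whose output it reads, as described above.
dependencies = {
    "0a_ChangeOfProperty": set(),
    "0b_MortgageLoans": set(),
    "0c_Familydata": {"0a_ChangeOfProperty", "0b_MortgageLoans"},
    "1_Extractdata": {"0c_Familydata"},
    "2_Robust_IOeightyears_SAS": {"0c_Familydata"},
    "3_Robust_IOeightyears_STATA": {"2_Robust_IOeightyears_SAS"},
    "4_VarDefinition": {"1_Extractdata"},
    "5_Analysis": {"4_VarDefinition", "3_Robust_IOeightyears_STATA"},
}

# Any order produced here runs every script after the scripts it depends on.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Several valid orders exist (for example, 0a and 0b can be run in either order), but 5_Analysis always comes last because it needs both analysis.dta and controlgroup.dta.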

